MProfiler: A Profile-Based Method for DNA Motif Discovery
Authors
Abstract
Motif finding is an important task in the study of gene regulation, which is essential to understanding how cells function. According to the assessment by Tompa et al., the performance of current motif finders is not satisfactory. A number of ensemble methods have been proposed to improve the results, and their overall performance is better than that of stand-alone motif finders. A recent ensemble method, MotifVoter, significantly outperforms all existing stand-alone and ensemble methods. In this paper we propose a method, MProfiler, that increases the accuracy of MotifVoter without increasing its running time by introducing an idea called center profiling. Our tests show that the generated clusters improve on those of MotifVoter in both accuracy and compactness. On 56 datasets, the final results of our method show an 80% improvement in the nucleotide-level correlation coefficient (nCC) and a 93% improvement in the nucleotide-level performance coefficient (nPC) over MotifVoter.
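The abstract reports results in terms of nCC and nPC without defining them. The sketch below shows how these two scores are commonly computed at the nucleotide level in motif-finder assessments, assuming the usual definitions nPC = TP / (TP + FN + FP) and nCC = (TP·TN − FN·FP) / sqrt((TP+FN)(TN+FP)(TP+FP)(TN+FN)). The function names and the representation of binding sites as sets of positions are illustrative assumptions, not part of MProfiler or MotifVoter.

```python
import math


def nucleotide_counts(known, predicted, total_length):
    """Count nucleotide-level TP, FP, FN, TN.

    known, predicted: sets of nucleotide positions covered by the known
    and the predicted binding sites (hypothetical representation).
    total_length: total number of nucleotides in the dataset.
    """
    tp = len(known & predicted)   # positions in both known and predicted sites
    fp = len(predicted - known)   # predicted but not known
    fn = len(known - predicted)   # known but not predicted
    tn = total_length - tp - fp - fn
    return tp, fp, fn, tn


def npc(tp, fp, fn):
    """Nucleotide-level performance coefficient (assumed definition)."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def ncc(tp, fp, fn, tn):
    """Nucleotide-level correlation coefficient (assumed definition)."""
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    return (tp * tn - fn * fp) / denom if denom else 0.0


# Toy example: a 10-nt known site and a 10-nt prediction overlapping by 5 nt
# in a 100-nt sequence.
known_sites = set(range(10, 20))
predicted_sites = set(range(15, 25))
tp, fp, fn, tn = nucleotide_counts(known_sites, predicted_sites, 100)
print(npc(tp, fp, fn), ncc(tp, fp, fn, tn))
```

Under these definitions nPC ranges from 0 to 1 and nCC from −1 to 1, so the percentage improvements quoted in the abstract are presumably relative to MotifVoter's scores.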
Similar resources
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Hybrid Gibbs-sampling algorithm for challenging motif discovery: GibbsDST.
The difficulties of computational discovery of transcription factor binding sites (TFBS) are well represented by (l, d) planted motif challenge problems. Large d problems are difficult, particularly for profile-based motif discovery algorithms. Their local search in the profile space is apparently incompatible with subtle motifs and large mutational distances between the motif occurrences. Here...
A profile-based deterministic sequential Monte Carlo algorithm for motif discovery
MOTIVATION: Conserved motifs often represent biological significance, providing insight on biological aspects such as gene transcription regulation, biomolecular secondary structure, presence of non-coding RNAs and evolution history. With the increasing number of sequenced genomic data, faster and more accurate tools are needed to automate the process of motif discovery. RESULTS: We propose a d...
DNA Motif Discovery Based on Ant Colony Optimization and Expectation Maximization
The identification of transcription factor binding sites (TFBSs) is important for understanding the genetic regulatory system, but weak conservation of TFBSs poses a challenge in computational biology. In this study, we propose a method based on the Ant Colony Optimization (ACO) and Expectation Maximization (EM) algorithm to discover DNA motifs (collections of TFBSs) in a set of bio-sequences. ...
Genome-wide discovery of transcriptional modules from DNA sequence and gene expression
In this paper, we describe an approach for understanding transcriptional regulation from both gene expression and promoter sequence data. We aim to identify transcriptional modules--sets of genes that are co-regulated in a set of experiments, through a common motif profile. Using the EM algorithm, our approach refines both the module assignment and the motif profile so as to best explain the ex...